Spectral entropy-based voice activity detector for videoconferencing systems
Authors
Abstract
This paper proposes a statistical voice activity detector (VAD) suited to videoconferencing applications, where it is useful to detect higher-level speech activities, e.g., sentences rather than syllables, words, or phrases. The proposed method uses two distinct features for VAD, energy and entropy in the decorrelated domain, which are modeled as chi-square and Gaussian distributions, respectively. Voice activity is determined by computing the joint probability under this statistical model and using it as a soft measure. Experimental results show that the proposed method is better suited to detecting higher-level speech activities than the traditional methods used for speech coding or automatic speech recognition.
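The abstract does not give the details of the model, so the following is only a minimal sketch of how such a two-feature soft measure might look. It assumes that a plain DFT stands in for the decorrelating transform, that both distributions describe noise-only frames, and that the two features are independent so their log-densities can be added. The function names and the parameters `noise_var`, `dof`, `h_mean`, and `h_std` are hypothetical and do not come from the paper.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (assumed decorrelating transform)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spec.append(abs(acc) ** 2)
    return spec


def frame_features(frame):
    """Return (energy, spectral entropy in nats) for one frame of samples."""
    spec = power_spectrum(frame)
    total = sum(spec)
    if total <= 0.0:
        # Silent frame: report zero energy and the maximum (flat-spectrum) entropy.
        return 0.0, math.log(len(spec))
    probs = [p / total for p in spec]
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    energy = sum(x * x for x in frame)
    return energy, entropy


def chi2_logpdf(x, dof):
    """Log-density of a chi-square distribution with `dof` degrees of freedom."""
    if x <= 0.0:
        return float("-inf")
    half = dof / 2.0
    return (half - 1.0) * math.log(x) - x / 2.0 - half * math.log(2.0) - math.lgamma(half)


def gaussian_logpdf(x, mean, std):
    """Log-density of a Gaussian distribution."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def noise_log_likelihood(energy, entropy, noise_var, dof, h_mean, h_std):
    """Joint log-likelihood of (energy, entropy) under the assumed noise-only model.

    Energy normalized by the noise variance is scored with a chi-square density,
    entropy with a Gaussian density; independence of the two is an assumption.
    """
    return chi2_logpdf(energy / noise_var, dof) + gaussian_logpdf(entropy, h_mean, h_std)
```

Under these assumptions, a frame whose joint log-likelihood under the noise model is low is more likely to contain speech. The noise parameters would have to be estimated from noise-only frames, and the resulting soft score would still need smoothing over time to capture sentence-level activity.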
Similar resources
A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)
Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, many non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech from noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...
Using Context-based Statistical Models to Promote the Quality of Voice Conversion Systems
This article examines methods for optimizing the performance of GMM-based voice conversion systems, taking the GMM method as the baseline to be improved. In current methods, because a single conversion function is used to convert all speech units, and because statistical averaging leads to spectral smoothing, we will observe quality ...
Exploring Non-linear Transformations for an Entropy-based Voice Activity Detector
In this paper we explore the use of non-linear transformations to improve the performance of an entropy-based voice activity detector (VAD). The idea of using a non-linear transformation comes from previous work in the field of speech linear prediction (LPC) based on source separation techniques, where the score function was added into the classical equations in order to take into a...
Energy and entropy based switching algorithm for speech endpoint detection in varying SNR conditions
In this work, we present an algorithm that switches between energy-based and entropy-based voice activity detectors (VADs) to provide improved performance under varying signal-to-noise ratio (SNR) conditions. The motivation for switching comes from the complementary behavior observed in the noise estimation performance of energy-based and entropy-based voice activity detectors when evaluated...
A multichannel speech/silence detector based on time delay estimation and fuzzy classification
Discontinuous transmission based on speech/pause detection represents a valid solution to improve the spectral efficiency of new-generation wireless communication systems. In this context, robust Voice Activity Detection (VAD) algorithms are required, as traditional solutions present a high misclassification rate in the presence of the background noise typical of mobile environments. The Fuzzy ...
Publication date: 2010